Abstract

Document ranking based on probabilistic evaluations of relevance
is known to exhibit non-classical correlations, which may be explained
by admitting a complex structure of the event space, namely, by
assuming that the events emerge from multiple sample spaces. Event
spaces formed by overlapping sample spaces are known in quantum
mechanics, where they may exhibit counter-intuitive features, called
quantum contextuality. In this Note I observe that, from the
structural point of view, quantum contextuality looks similar to the
personalization of information retrieval scenarios. Along these lines,
Knowledge Revision is treated as an operationalistic measurement, and a
way to quantify the rate of personalization of Information Retrieval
scenarios is suggested.

1 The evolution of information needs 



The notion of information needs was clearly formulated by Taylor [12]. Along
with the development of IR systems, the very structure of information needs
and of queries has been subject to evolution. Briefly, its mainstream can be
described by the following transition (read upwards):



        Knowledge Revision (KR)
                  ↑
        Information Retrieval (IR)
                  ↑
        Data Retrieval                                  (1)

each stage using the previous one as a background. Information Retrieval
uses the Data Retrieval environment while modifying the structure of queries, as
formulated by Lancaster: "An information retrieval system does not inform
(i.e. change the knowledge of) the user on the subject of his inquiry. It
merely informs on the existence (or non-existence) and whereabouts of
documents relating to his request" [5]. The next stage is the increasing
personalization of search. The user interacts with an IR environment with
the goal of updating the state of his knowledge (belief) rather than retrieving a
particular document. In this way Information Retrieval serves Knowledge
Revision (KR).

How does quantum mechanics come in? The chain (1) can be compared with the
transition from classical mechanics, dealing with the absolute character of the
values measured, to quantum mechanics, where the result of a measurement
is produced by an act of will of the observer rather than by retrieving a
pre-existing value. In both extreme cases, the retrieval act is nothing but a
measurement. The retrieval metaphors evolve much as the notion of
measurement has evolved.

We shall also deal with the general notion of Information Needs (IN),
arranging them in four levels [12]:



        visceral need
              ↓
        conscious need
              ↓
        formalized need
              ↓
        compromised need                                (2)

with the following meaning:

• The visceral need is the actual, but unexpressed, need for information. 

• The conscious need is a within-brain description of the need. 

• The formalized need is a formal statement of the question. 

• The compromised need is the question as presented to the information 
system. 

The chain (1) reflects the upward transition in the above list, and
personalization closely approaches the visceral IN. In this Note I deal with the
quantification of personalization, the crucial part of Knowledge Revision,
using the quantum metaphor. The technical basis for this quantitative approach
is formed by the following research lines:

• Simulation of quantum contextuality effects by finite automata and
the evaluation of the amount of memory required for this simulation
[1]. The basic idea is to reverse this argument and to evaluate the
features of a quantum system which can, in a certain sense, be simulated
by a given IR environment.

• The evaluation of violations of classical probabilistic laws by index
term probabilities, carried out by Melucci [7], and the quantitative
evaluation of the amount of contextuality by Svozil [11].

2 On the nature of non-classical correlations 

In general, non-classical correlations appear when the Kolmogorovian probability
model is no longer applicable. The basic point of the Kolmogorovian model is
the existence of a (single) sample space $\Omega$. The events are subsets of $\Omega$, while
the points of the sample space are elementary and independent.

In order to test this or that model, we employ Accardi's statistical
invariants [2], which allow us to test the applicability of the Kolmogorovian model.
Given:

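• a family of discrete maximal observables $\{A_\alpha : \alpha = 1, \ldots, T\}$ ($T$ being
finite), each observable $A_\alpha$ taking a finite number of values $a^{\alpha}_{j_\alpha}$ labelled
by $j_\alpha = 1, \ldots, n$

• the experimentally measurable conditional probabilities $p_{j_\alpha j_\beta}(\beta \mid \alpha)$

$$p_{j_\alpha j_\beta}(\beta \mid \alpha) = P\left(A_\beta = a^{\beta}_{j_\beta} \mid A_\alpha = a^{\alpha}_{j_\alpha}\right) \qquad (3)$$

The problem is: does there exist a probability space $(\Omega, \mathcal{F}, P)$ and $T$
measurable partitions $A^{\alpha}_{j}$ of cardinality $n$ (the number of distinct values of each
observable is assumed to be the same)

$$A^{\alpha}_{j}, \quad \alpha = 1, \ldots, T, \quad j = 1, \ldots, n$$

such that for any $\alpha, \beta = 1, \ldots, T$ one has

$$\frac{P\left(A^{\beta}_{j_\beta} \cap A^{\alpha}_{j_\alpha}\right)}{P\left(A^{\alpha}_{j_\alpha}\right)} = p_{j_\alpha j_\beta}(\beta \mid \alpha) \qquad (4)$$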



To get the answer, a linear programming problem is to be solved
[2]; that is, the problem of the existence of a single sample space is finitely
decidable.
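One way to see why the question is finitely decidable is sketched below, under the assumption that a candidate model can be reduced to a joint distribution $\mu$ over the $n^T$ combinations of outcome labels $(j_1, \ldots, j_T)$. The symbol $\mu$ is introduced here purely for illustration. Multiplying condition (4) through by its denominator makes it linear in $\mu$, so the existence question becomes a feasibility problem with finitely many unknowns and constraints:

```latex
\begin{align*}
&\text{find } \mu(j_1,\ldots,j_T) \ge 0
  \quad\text{with}\quad \sum_{j_1,\ldots,j_T} \mu(j_1,\ldots,j_T) = 1, \\
&\text{such that for all } \alpha \ne \beta \text{ and all } k, l \in \{1,\ldots,n\}: \\
&\sum_{\substack{j_1,\ldots,j_T \\ j_\alpha = k,\; j_\beta = l}} \mu(j_1,\ldots,j_T)
  \;=\; p_{kl}(\beta \mid \alpha)
  \sum_{\substack{j_1,\ldots,j_T \\ j_\alpha = k}} \mu(j_1,\ldots,j_T).
\end{align*}
```

The left-hand sum plays the role of the numerator in (4) and the right-hand sum that of the denominator. The linear form is trivially satisfied when a marginal vanishes, so in this sketch the case of zero marginals would need separate care.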
In the sequel we shall need the special case of three observables $A$, $B$, $C$, each
taking only two values: $a_1, a_2$ for $A$, $b_1, b_2$ for $B$ and $c_1, c_2$ for the observable
$C$. The transition probability matrix for each pair of observables, being
bistochastic, has only one numeric parameter. Denote the appropriate
matrices as

$$P(A \mid B) = \begin{pmatrix} p & 1-p \\ 1-p & p \end{pmatrix}, \quad P(B \mid C) = \begin{pmatrix} q & 1-q \\ 1-q & q \end{pmatrix}, \quad P(C \mid A) = \begin{pmatrix} r & 1-r \\ 1-r & r \end{pmatrix} \qquad (5)$$

then these transition probabilities can be described by a Kolmogorovian
model (that is, they are produced by a single sample space) if and only
if

$$|p + q - 1| \le r \le 1 - |p - q| \qquad (6)$$

3 The operationalistic metaphor

There is a straightforward analogy between IR and the process of
measurement. There is a search machine, which we may treat as prepared in
a certain state, and there is an observer, who performs a measurement. Typically,
the preparation of the query system does not anticipate the query asked
by the user; this causes a mismatch, which has to be handled.

A mismatch between preparation and measurement
is a source of paradoxes and counter-intuitive observable
consequences in quantum mechanics. It results in the possible randomness of
individual outcomes, even though the previous stages were deterministically prepared. To
deal with it, context translation is introduced as a way of handling the mismatch
between state preparation and measurement. In quantum mechanics this
metaphorically looks as follows [10]. Suppose an electron is prepared, using a
Stern-Gerlach device, in a pure spin state along the z axis, always showing spin up.
Then we decide to ask the so-prepared electron a complementary question:
"what is the direction of spin along the x axis?" Quantum mechanics tells us
that the electron is incapable of storing more than one bit of
information (assuming otherwise leads to direct experimental contradictions).
That is why the electron gives a random reply to this query. This is what
makes it different from deterministic query agents, which are not able to handle
improper input and offer no answer to it.
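The "random reply" of the electron can be illustrated numerically with the Born rule. The sketch below assumes the standard textbook representation of spin-1/2 states as two-component complex vectors in the z basis; the function and variable names are illustrative choices, not notation from the text.

```python
import math


def born_probability(outcome, state):
    """Probability |<outcome|state>|^2 of finding `outcome` in `state`."""
    amplitude = sum(o.conjugate() * s for o, s in zip(outcome, state))
    return abs(amplitude) ** 2


s = 1 / math.sqrt(2)
z_up = (1 + 0j, 0 + 0j)    # prepared state: spin up along z
x_up = (s + 0j, s + 0j)    # spin up along x, written in the z basis
x_down = (s + 0j, -s + 0j) # spin down along x, written in the z basis

p_z_up = born_probability(z_up, z_up)      # repeat the question it was prepared for
p_x_up = born_probability(x_up, z_up)      # ask the complementary question
p_x_down = born_probability(x_down, z_up)
```

Repeating the z question reproduces the prepared answer with certainty, while the complementary x question yields each outcome with probability one half (up to floating-point rounding), which is the randomness described above.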

Modern IR environments are no longer so rigid; they easily handle any
kind of input. If you ask them, you almost always get an answer, but
sometimes this answer may be of no relevance to you personally.
To overcome this, search engines are configured to track the user's requests, or,
in other words, to keep the context associated with a particular user and his
present role. Each such particular action I call a knowledge revision
scenario. In practice this is done by seeding pebbles along the way the
user goes through the jungles of the World Wide Web, say, by storing browser
cookies. These pebbles are, after all, just sequences of bits. Now suppose
our task is to judge to what extent the act of measurement is personalized;
let us view it from the perspective of quantum measurement. To do so, recall
a series of recent works summarized in [1].
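The remark that the stored context is "just sequences of bits" can be made concrete with a toy query agent. This is a sketch only: the class name `ContextAgent`, the three hypothetical yes-no queries and the rule "remember the last query asked" are assumptions made for illustration, not a model of any real search engine.

```python
import math


class ContextAgent:
    """Toy deterministic agent whose reply depends on a stored context.

    The context is the previously asked query (None before the first query).
    """

    def __init__(self, queries):
        self.queries = tuple(queries)
        self.context = None  # nothing has been asked yet

    def ask(self, query):
        if query not in self.queries:
            raise ValueError(f"unknown query: {query!r}")
        # Hypothetical rule: reply 'yes' only when the same query is repeated.
        answer = self.context == query
        self.context = query  # drop a 'pebble': remember what was asked
        return answer

    def internal_states(self):
        # Any query may be the remembered one, plus the initial empty context.
        return len(self.queries) + 1

    def memory_bits(self):
        # Number of bits needed to tell all internal states apart.
        return math.ceil(math.log2(self.internal_states()))


agent = ContextAgent(["sports?", "finance?", "travel?"])
replies = [agent.ask(q) for q in ["sports?", "sports?", "finance?", "sports?"]]
bits = agent.memory_bits()
```

The same query "sports?" receives different replies depending on what was asked before it, and the price of this context dependence is the number of bits the agent has to keep between queries.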

4 Quantifying the personality in Knowledge 
Revision scenarios 

In brief, quantum contextuality manifests itself as follows: when measuring
quantum systems, the result may depend on which other compatible
observables are measured simultaneously. Furthermore, these other observables
may be merely intended to be measured rather than actually measured. This
cloud of potentially co-measurable values is referred to as the context. When
a quantum system is simulated by agents with internal memory (recall that, as
noted above, quantum systems are so smart that they behave in this way
without having internal memory), the agent attains different internal states
in the course of carrying out a sequence of elementary queries. The minimal
amount of memory needed to simulate particular manifestations of quantum
contextuality is called the memory cost of this quantum effect. The paper [1]
explores the memory cost of simulating quantum contextuality effects observed
on singlet states of positronium. It gives a clue to draw a correspondence:

        personalization  ⟷  quantum contextuality

In general, the memory cost increases as more and more contextuality
constraints are considered. The complexity of the contextuality constraints depends,
in turn, on the dimension of the state space of the system in question.

I suggest the following technical idea: to reverse the argument of the authors
of [1]. We start with an IR environment and ask how complex
the quantum contextual features it may exhibit are. Furthermore, we may reduce
the answer to just a number (or a string of numbers), namely, the
dimensionality (or a tensor product structure, TPS [13]) of a quantum system
demonstrating similar context dependence.

        KR scenario  →  quantum system

How to do this? What is to be simulated? Here I dwell only on the logical
and certain probabilistic aspects of the simulation. To this end, proper tools
for dealing with the structure of the collection of properties of a system are
introduced.

Overlapping contexts. It was observed by different authors that
complex IR systems are not well described by probabilistic models based on a
single sample space. In [9] it was explicitly shown that Bayesian reasoning
in its direct form fails and that, in order to get adequate evaluations, when writing
conditional probabilities $P(A \mid B)$ one should take care to specify the
context, that is, the particular sample space in which these conditional probabilities
are calculated. Meanwhile, the small sample spaces are not separated:
they overlap, and there are events belonging to different contexts. It turns out that
the classical contingency table





                    RETRIEVED                NOT RETRIEVED
RELEVANT            $A \cap B$               $A \cap \bar{B}$               $A$
NON-RELEVANT        $\bar{A} \cap B$         $\bar{A} \cap \bar{B}$         $\bar{A}$
                    $B$                      $\bar{B}$





ceases to be adequate. The reason is that even within a single scenario both $A$
and $B$ may belong to different contexts; in particular, $\bar{A}$ is no longer uniquely
defined by $A$ (the same holds for $\bar{B}$ and $B$). How to capture this structure? A tool
of combinatorial nature is needed to describe overlapping contexts. First
note that a single sample space is structureless: all its elements are equally
(un)related to each other. In the case of overlapping contexts this is no longer
so. A graphical (and combinatorial) way to capture such relations was
suggested by R. Greechie (see [3] for an overview). The idea is to

(i) consider all the elements of all sample spaces together

(ii) label each element with a tag pointing to the appropriate context

The Kolmogorovian probabilities (and hence Bayesian inference) come from
the fact that the logic of statements about the appropriate sample space is
Boolean. In the case of pasted contexts this is no longer so: the structure of all
the statements about the IR environment is no longer Boolean.
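Steps (i) and (ii) can be sketched as a small data structure, followed by a brute-force check of whether the pasted contexts still admit a global yes/no valuation that selects exactly one element in every context, which is the kind of assignment the points of a single sample space would provide. The three-context "triangle" below is a hypothetical toy hypergraph chosen for brevity; it is not claimed to be the Greechie diagram of any actual quantum system or IR collection, and the function names are illustrative.

```python
from itertools import product

# Hypothetical toy example: three two-element contexts pasted pairwise.
triangle = {
    "c1": ("a", "b"),
    "c2": ("b", "c"),
    "c3": ("c", "a"),
}

# Two contexts that do not overlap, for comparison.
disjoint = {
    "c1": ("a", "b"),
    "c2": ("c", "d"),
}


def tag_elements(contexts):
    """Pool all elements and tag each one with the contexts it belongs to."""
    tags = {}
    for name, elements in contexts.items():
        for element in elements:
            tags.setdefault(element, set()).add(name)
    return tags


def two_valued_states(contexts):
    """List all 0/1 valuations with exactly one 'true' element per context."""
    elements = sorted(tag_elements(contexts))
    found = []
    for values in product((0, 1), repeat=len(elements)):
        valuation = dict(zip(elements, values))
        if all(sum(valuation[e] for e in ctx) == 1 for ctx in contexts.values()):
            found.append(valuation)
    return found


tags = tag_elements(triangle)
triangle_states = two_valued_states(triangle)
disjoint_states = two_valued_states(disjoint)
```

Each context of the triangle, taken alone, is an ordinary two-point sample space, yet no valuation is consistent across all three, whereas the two disjoint contexts admit four. Since a classical probability model on the whole structure would be a mixture of such valuations, their absence rules out a single sample space for this toy example.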

How do contextuality effects arise? Mainly in the form of Kochen-Specker
reasoning, stating that particular hypothetical probability assignments, such as
a total probability distribution on the whole diagram viewed as
a single sample space, do not exist. Such results signal that the
evaluation of conditional probabilities based on the standard Bayes model will
no longer be adequate. For examples of such violations in quantum mechanics
see [1]; in IR this also takes place, see, for instance, [7]. Quantitatively it
looks as follows.



'How much contextuality'? So far, only qualitative ideas were provided.
The next step is to try to evaluate them, posing the question 'How much
contextuality?' A possible transparent answer was recently proposed in [11].
We take a representative sampling of observables and simply compute the fraction
of the triples for which the Accardi inequalities (6) are violated.
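For the sampling it helps to note that the Accardi condition $|p + q - 1| \le r \le 1 - |p - q|$ unfolds into four linear inequalities. This is elementary algebra added here as a worked step, not a result quoted from the literature:

```latex
\begin{align*}
|p + q - 1| \le r &\iff p + q - r \le 1 \ \text{ and } \ p + q + r \ge 1, \\
r \le 1 - |p - q| &\iff p - q + r \le 1 \ \text{ and } \ -p + q + r \le 1.
\end{align*}
```

The resulting system of four inequalities is invariant under any permutation of $p$, $q$, $r$, so the verdict for a sampled triple does not depend on which of the three parameters is labelled $r$.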

Using the ideas of [11], the rate of personalization can be evaluated in a
similar way. First, by random sampling, triples of properties, that is, yes-no
queries, are picked. Then, for each triple, the transition probability matrices
(5) are calculated. For each sampled triple the inequalities (6) are
checked. Then the ratio of samples is calculated:

$$\mathrm{Pers} = \frac{\text{number of triples violating (6)}}{\text{total number of sampled triples}} \qquad (7)$$
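The estimate (7) can be sketched in a few lines, assuming that each sampled triple has already been reduced to its three parameters $(p, q, r)$ and that the inequality to be checked is $|p + q - 1| \le r \le 1 - |p - q|$. The function names and the numerical tolerance are illustrative choices, and the sample values are made up for the example.

```python
def accardi_holds(p, q, r, tol=1e-12):
    """Check |p + q - 1| <= r <= 1 - |p - q| up to a small tolerance."""
    return abs(p + q - 1) - tol <= r <= 1 - abs(p - q) + tol


def personalization_rate(triples):
    """Fraction of sampled (p, q, r) triples that violate the inequality."""
    triples = list(triples)
    if not triples:
        raise ValueError("at least one sampled triple is required")
    violations = sum(1 for p, q, r in triples if not accardi_holds(p, q, r))
    return violations / len(triples)


# Made-up sample: two triples satisfy the inequality, two violate it.
sample = [
    (0.5, 0.5, 0.5),
    (0.9, 0.9, 0.1),
    (1.0, 1.0, 1.0),
    (0.85, 0.85, 0.5),
]
pers = personalization_rate(sample)
```

A value of 0 means that every sampled triple is compatible with a single sample space; larger values indicate a larger share of triples that are not.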



Conclusions 



Vector models of IR become more and more popular, first of all because they
make it possible to carry out multi-document actions. In this paper I dwell
on the QIA framework [8]. The basic ingredient of the QIA framework is a Hilbert
space $H$ called the information need space. In its simplest form, the IN space
is a linear space of elementary (atomic) topics. In my approach, I suggest
introducing the IN space so that it captures the necessary amount of
contextuality. The ideology of the IN space is the closest to that of quantum
mechanics. In QM, the state space of a system is a space of some internal
(in the deepest possible sense) features of the system, while the observables
are expressed in terms of operators and other derived structures on the state
space. Similar things happen in the QIA approach. The space of information
needs exists per se; we may treat it as spanned by elementary entities, but this
will be nothing but a representation of this space. The source of emergence
of this space lies in the multicontextual structure described in the previous
section. Furthermore, as pointed out in [6], [7], the correlations which occur in
an IR environment may even be stronger than quantum ones. In this case a
straightforward Hilbert space model may fail to work properly, and we may
call on 'foil quantum theories' to grip these situations.

So far, I was interested in information retrieval situations in which the result
of a particular action may depend on other actions which the IR agent
could in principle perform alongside the actions actually performed. This
phenomenon is called contextuality; we encounter it in IR and have to take
it into account and work with it. A similar kind of dependence takes place in
quantum mechanics.

        Information Retrieval          Quantum Mechanics
        personalization                contextuality

The difference is that in QM contextuality appears by itself, not being
originated by some 'internal mechanisms'. The situations where contextuality
occurs depend on the state space of the system and on the structure of the observables
involved. In the realm of QM we can quantitatively evaluate the rate of
contextuality [11]. The origin of contextuality effects in IR stems from the
personalization of query scenarios. The personalization, in turn, can be
quantified by the memory resources required to keep track of the information needs
of a particular user (note that 'user' in this context might not be a single
person, nor even a 'person' at all). The idea of this Note was to demonstrate
that, using the quantum mechanical formalism, we can quantify the rate of
personalization in particular IR environments. To do this, I suggest reversing
the procedure of estimating the memory cost of quantum contextuality based
on simulating quantum systems by finite automata. Instead, a KR scenario
(which is, as a matter of fact, a sequence of queries upon a finite automaton)
is to be simulated by an appropriate scenario of quantum
measurement demonstrating the same contextuality features. As a result, a Hilbert
space of an appropriate quantum system emerges together with a collection of
observables. This Hilbert space is suggested to play the role of the information
need space, which is developed within the QIA (quantum information access)
framework for Information Retrieval. Technically, the IN space is built
starting from Greechie-like diagrams (pasted overlapping contexts, see Section 4
above) capturing the particular IR environment. The QIA framework provides
more flexible machinery to deal with information needs than any classical
probabilistic approach, for the simple reason that it incorporates the latter.
But we should be aware that it is not ultimately general. In the quantum realm,
we have non-classical correlations, and the present state of our knowledge
shows that quantum mechanics is enough to explain all of them. However,
IR may in principle provide stronger-than-quantum correlations. For them,
'foils of quantum theory', the operational theories which do not compete
with quantum mechanics but generalize it to an extent not demanded in
modern physics [6], may be of help in Information Processing.